libdatadog update to 952c2ef7#3983

Open
dd-octo-sts[bot] wants to merge 2 commits into master from bot/libdatadog-latest

@dd-octo-sts dd-octo-sts Bot commented Jun 13, 2026

Summary

Automated update of the libdatadog submodule to the latest HEAD.

| | SHA |
|---|---|
| Previous | $LIBDATADOG_PINNED_SHA |
| New | 952c2ef75cdf7b2895d7152100ea61c12ccf4439 |

Full CI result: ❌ 2 job(s) failed
CI pipeline: https://gitlab.ddbuild.io/DataDog/apm-reliability/dd-trace-php/-/pipelines/119319187


libdatadog Integration Report

libdatadog SHA: 952c2ef75cdf7b2895d7152100ea61c12ccf4439
Analysis date: 2026-06-17

Overall status

✅ Clean update (no API changes required; the only persistent failures are flaky Windows job timeouts)

Build & test summary

The pipeline (119301593) finished with 2 persistent failures, both of the same kind:

| Job | PHP | Platform | Failure reason | Duration |
|---|---|---|---|---|
| windows test_c: [7.2] | 7.2 | windows-v2:2019 | job_execution_timeout | 3602 s |
| windows test_c: [7.3] | 7.3 | windows-v2:2019 | job_execution_timeout | 3602 s |

Key observations:

  • No compilation failures. tmp/artifacts/traces/ is empty — no job failed to compile against the new libdatadog. The Rust/C/FFI code built cleanly on every platform that produced a trace.
  • The windows test_c job (.gitlab/generate-tracer.php:111) builds the extension with nmake inside the Windows container as its first step (.gitlab/generate-tracer.php:139) before running the test suite. Its timeout is 60m (.gitlab/generate-tracer.php:190). An API incompatibility would have aborted at the nmake step in minutes. Instead both jobs consumed the entire 60-minute wall-clock budget (3602 s ≈ 60 m 02 s) and were then killed — i.e. compilation succeeded and the PHP extension test suite (tests\ext) was still running when the clock expired.
  • Only the two oldest PHP versions on Windows are affected. PHP 7.4 / 8.x on Windows, all Linux test_c, and the ASAN jobs passed. A libdatadog hang/deadlock in the FFI layer would be expected to affect every PHP version and Linux as well, not just Windows 7.2/7.3.
  • The previous run of this pipeline reported 14 failures; this run is down to 2, and the retried ASAN test_c with multiple observers: [8.4] recovered on retry (it is not in the persistent-failure list). The two Windows jobs were also retried (tmp/artifacts/retried_jobs.tsv) and still hit the wall-clock limit — consistent with chronic Windows-runner slowness rather than a deterministic crash.
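The timing argument in the observations above can be sketched as a small classifier. This is an illustration only: the `FailureKind` enum, the `classify` function, and the `slack` tolerance are hypothetical names and are not part of dd-trace-php or its CI tooling.

```rust
use std::time::Duration;

/// Hypothetical classification of a failed CI job; not part of dd-trace-php.
#[derive(Debug, PartialEq)]
enum FailureKind {
    /// The job ran until the wall-clock limit and was killed.
    WallClockTimeout,
    /// The job stopped well before the limit (for example, a build break).
    EarlyAbort,
}

/// Classify a failed job from its duration and its configured timeout.
/// `slack` is an assumed tolerance for runner bookkeeping overhead.
fn classify(duration: Duration, limit: Duration, slack: Duration) -> FailureKind {
    if duration + slack >= limit {
        FailureKind::WallClockTimeout
    } else {
        FailureKind::EarlyAbort
    }
}
```

With the reported duration of 3602 s and a 3600 s limit, `classify` returns `WallClockTimeout`; a job that aborted at the nmake step after a few minutes would come back as `EarlyAbort`.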

The windows test_c workload is heavy and slow on the Windows Server 2019 runners: it creates a NAT Docker network, starts three Windows containers (httpbin, request-replayer, and the build/test container), produces a debug build (configure.bat --enable-debug-pack, which is markedly slower), and then runs the full extension test suite. Debug builds on the oldest (slowest) PHP toolchains routinely sit near the 60-minute boundary, so a couple of them tipping over is expected variance.

Non-trivial changes made

No code changes required.

The new libdatadog SHA compiled and linked cleanly across all platforms (no compilation traces were emitted). None of the breaking changes in the changelog (e.g. VecMap span representation #2043/#2069, trace-buffer changes #2055/#2046, FFI error/response alignment #2029, crashtracking refactors) surfaced as a build break against dd-trace-php's call sites, so there was nothing to adapt.

Identified libdatadog issues

None identified.

There is no panic, regression, or behavioural change attributable to libdatadog. The failures are wall-clock timeouts in the CI infrastructure, not faults originating inside libdatadog.

Flaky / ignored failures

  • windows test_c: [7.2] and windows test_c: [7.3]: both failed with failure_reason: job_execution_timeout after 3602 s, i.e. at the job's 60m limit. These are timing/infrastructure failures (heavy Docker-on-Windows setup + debug build + full test suite on the slowest PHP toolchains), not libdatadog API or behaviour regressions. Per the classification rules, timing-based timeouts are treated as flaky and not fixed in code. Recommended action: re-run the two jobs; if they persistently sit at the 60-minute boundary, raise the windows test_c timeout or split the Windows matrix to give the oldest PHP versions more headroom. This is a CI-config concern, independent of the libdatadog bump.

/cc @bwoebi

@dd-octo-sts dd-octo-sts Bot requested review from a team as code owners June 13, 2026 05:00
@dd-octo-sts dd-octo-sts Bot requested review from leoromanovsky and sameerank and removed request for a team June 13, 2026 05:00

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f826cc728b

Comment thread libdatadog Outdated
@@ -1 +1 @@
-Subproject commit 6760faaeeda1cfcf634410105f93cf7149265592
+Subproject commit c79d783f79f4a2d1e637906f3323600c6e2b5b17

P1: Regenerate the sidecar FFI signature

This libdatadog bump changes ddog_sidecar_session_set_config by inserting retry_interval_milliseconds after flush_interval_milliseconds, but the checked-in components-rs/sidecar.h and the call in ext/sidecar.c still use the old argument list. Because C compiles against the stale header, this will not be caught at compile time; at runtime the new Rust FFI function will read every argument after the flush interval in the wrong slot, so normal sidecar startup can misconfigure intervals/sizes and eventually interpret non-pointer values as strings or callbacks. Please regenerate/update the header and pass the new retry interval at the call site as part of this bump.
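To make the slot shift concrete, here is a safe simulation rather than real FFI. It binds the values a stale caller would pass, in the old order, to the parameter names the updated callee expects. Only `flush_interval_milliseconds` and `retry_interval_milliseconds` come from the comment above; the other parameter names, the numeric values, and the `bind_positionally` helper are hypothetical, and real arguments have mixed types rather than plain `u64`.

```rust
/// Parameter order the updated Rust function expects. Only the first two names
/// come from the review comment; the remaining two are hypothetical placeholders.
const NEW_PARAMS: [&str; 4] = [
    "flush_interval_milliseconds",
    "retry_interval_milliseconds",
    "hypothetical_size_a",
    "hypothetical_size_b",
];

/// Values a caller built against the stale header would pass, in the old order:
/// flush interval, then the two hypothetical sizes. All numbers are made up.
const OLD_CALL_VALUES: [u64; 3] = [5_000, 1_000_000, 10_000_000];

/// Bind positional argument values to the callee's parameter names.
/// `None` marks a slot the caller never filled; in real FFI that slot would
/// hold whatever happens to be in the register or stack location.
fn bind_positionally<'a>(
    callee_params: &[&'a str],
    caller_values: &[u64],
) -> Vec<(&'a str, Option<u64>)> {
    callee_params
        .iter()
        .enumerate()
        .map(|(i, name)| (*name, caller_values.get(i).copied()))
        .collect()
}
```

Calling `bind_positionally(&NEW_PARAMS, &OLD_CALL_VALUES)` leaves the flush interval correct, hands the first size to `retry_interval_milliseconds`, and leaves the last parameter unfilled. The C compiler does not report this, because it only sees the stale header.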

Comment thread components-rs/ffe.rs
AssignmentReason::Static => REASON_STATIC,
AssignmentReason::TargetingMatch => REASON_TARGETING_MATCH,
AssignmentReason::Split => REASON_SPLIT,
AssignmentReason::Default => REASON_DEFAULT,

P2: Handle invalid flag configs as defaults

The new libdatadog revision also adds EvaluationError::FlagConfigurationInvalid, which get_assignment can return for a requested flag whose per-flag config is invalid/unsupported; upstream FFE FFI maps that case to DEFAULT with no error so callers get their supplied default. This wrapper only added the new assignment reason, while the error match below still sends the new error through _ => (ERROR_GENERAL, REASON_ERROR), so those flags will now surface as evaluation errors instead of default evaluations. Please add an explicit arm for the new error variant.
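A minimal sketch of the requested explicit arm, using stand-in types. The enum below is a hypothetical stub, not the libdatadog definition; `SomeOtherError`, `ERROR_NONE`, the `map_error` helper, and all numeric constant values are assumptions made for illustration. Only `FlagConfigurationInvalid`, `ERROR_GENERAL`, `REASON_ERROR`, and `REASON_DEFAULT` are names taken from the review.

```rust
/// Hypothetical stand-in for libdatadog's error type; the real enum has its
/// own set of variants, of which only FlagConfigurationInvalid is named here.
#[allow(dead_code)]
#[derive(Debug)]
enum EvaluationError {
    FlagConfigurationInvalid,
    SomeOtherError, // placeholder for every other variant
}

// Constant names other than ERROR_NONE appear in the review; all values are assumed.
const ERROR_NONE: i32 = 0;
const ERROR_GENERAL: i32 = 1;
const REASON_DEFAULT: i32 = 0;
const REASON_ERROR: i32 = 1;

/// Map an evaluation error to an (error code, reason) pair.
fn map_error(err: &EvaluationError) -> (i32, i32) {
    match err {
        // Explicit arm: an invalid per-flag config yields the caller's default
        // with no error, mirroring the upstream behaviour described above.
        EvaluationError::FlagConfigurationInvalid => (ERROR_NONE, REASON_DEFAULT),
        // Everything else keeps the existing catch-all mapping.
        _ => (ERROR_GENERAL, REASON_ERROR),
    }
}
```

Placing the explicit arm before the catch-all means the `_` arm no longer sees the new variant, while other errors are still reported as before.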

datadog-prod-us1-3 Bot commented Jun 13, 2026

⚠️ Warnings

🚦 15 Pipeline jobs failed

DataDog/apm-reliability/dd-trace-php | ASAN test_c with multiple observers: [8.4]

DataDog/apm-reliability/dd-trace-php | check-big-regressions

DataDog/apm-reliability/dd-trace-php | test_extension_ci: [7.0]

View all 15 failed jobs.

ℹ️ Info

🔄 Datadog auto-retried 1 job; 1 passed on retry

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 54.08% (-0.04%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: c111448

@dd-octo-sts dd-octo-sts Bot requested a review from a team as a code owner June 13, 2026 10:13
pr-commenter Bot commented Jun 13, 2026

Benchmarks [ tracer ]

Benchmark execution time: 2026-06-17 10:28:44

Comparing candidate commit c111448 in PR branch bot/libdatadog-latest with baseline commit 941af06 in branch master.

Found 0 performance improvements and 1 performance regression! Performance is the same for 192 metrics; 1 metric is unstable.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'
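The significance rule can be sketched as follows. This assumes "entirely outside the threshold" means the whole interval lies above `+threshold` or below `-threshold`; that reading, the `Verdict` enum, and the `judge` function are assumptions for illustration, not the benchmarking platform's actual code.

```rust
/// Hypothetical verdict type for one metric; not the platform's own API.
#[derive(Debug, PartialEq)]
enum Verdict {
    SignificantIncrease,
    SignificantDecrease,
    NotSignificant,
}

/// Decide significance from a confidence interval over the relative
/// difference of means, with all quantities expressed in percent.
fn judge(ci_lower_pct: f64, ci_upper_pct: f64, threshold_pct: f64) -> Verdict {
    if ci_lower_pct > threshold_pct {
        // The whole interval sits above +threshold.
        Verdict::SignificantIncrease
    } else if ci_upper_pct < -threshold_pct {
        // The whole interval sits below -threshold.
        Verdict::SignificantDecrease
    } else {
        // The interval overlaps the [-threshold, +threshold] band.
        Verdict::NotSignificant
    }
}
```

With the 1% threshold from the second diagram, `judge(1.3, 3.1, 1.0)` yields `SignificantIncrease`, which is a regression for an execution time metric. The interval from the first diagram, `judge(-0.6, 1.2, 1.0)`, yields `NotSignificant` under the same assumed threshold.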

scenario:TraceSerializationBench/benchSerializeTrace-opcache

  • 🟥 execution_time [+1.058ms; +1.077ms] or [+260.137%; +264.667%]
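The absolute and relative bounds above can be cross-checked against each other. The sketch below assumes both express the same difference relative to the baseline mean, as the explanation states; `implied_baseline_ms` is a hypothetical helper, not part of the benchmark tooling.

```rust
/// Recover the baseline mean implied by an absolute delta (ms) and the same
/// delta expressed as a percentage of the baseline.
fn implied_baseline_ms(abs_delta_ms: f64, rel_delta_pct: f64) -> f64 {
    abs_delta_ms / (rel_delta_pct / 100.0)
}
```

Both bounds give roughly the same figure: `implied_baseline_ms(1.058, 260.137)` and `implied_baseline_ms(1.077, 264.667)` each come out near 0.407 ms. Under that assumption the candidate's mean execution time is about 1.47 to 1.48 ms, roughly 3.6 times the baseline.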

@dd-octo-sts dd-octo-sts Bot force-pushed the bot/libdatadog-latest branch 3 times, most recently from b26baf8 to f6df517 Compare June 16, 2026 05:18
@dd-octo-sts dd-octo-sts Bot changed the title libdatadog update to c79d783f libdatadog update to fbc94528 Jun 16, 2026
@dd-octo-sts dd-octo-sts Bot changed the title libdatadog update to fbc94528 libdatadog update to 952c2ef7 Jun 17, 2026
@dd-octo-sts dd-octo-sts Bot force-pushed the bot/libdatadog-latest branch from 1ebeb95 to c8f0703 Compare June 17, 2026 05:01